Algorithmes d'analyse syntaxique par grammaires lexicalisées : optimisation et traitement de l'ambiguïté
Abstract
This work addresses the automatic parsing of written texts using lexicalized grammars and large-coverage language resources. More specifically, we concentrated on three areas: algorithms, the easy development of NLP applications useful in an industrial context, and deep syntactic parsing. Concerning the first point, we implemented new algorithms for optimizing local grammars before they are used for parsing, and we propose an efficient algorithm for applying this kind of grammar to text. Our algorithm improves the processing of lexical and syntactic ambiguities, and we show that it scales well when processing large text corpora in combination with fine-grained, large-coverage language resources. Concerning the second point, we were actively committed to the development of the Outilex project, a general-purpose linguistic platform dedicated to text processing. Through its modular architecture, the platform aims to ease the development of high-level hybrid NLP applications that mix symbolic and stochastic approaches. Finally, the third part of our research involves exploiting the lexicon-grammar tables for deep syntactic parsing and for identifying predicate-argument structures in French texts. For this purpose, we enhanced the formalism of local grammars with feature-structure constraints. These constraints make it possible to handle many syntactic phenomena declaratively in our grammar and to formalize the result of syntactic parsing. We present our grammar for French in its current state, which is semi-automatically generated from the lexicon-grammar tables, and we report an evaluation of its lexical and syntactic coverage.